Secure data storage and exchange processes

A digital solution that exchanges data over the Internet needs to protect three main security properties, each listed here with the threat it guards against:

  • confidentiality (vs unauthorized access)
  • integrity (vs malicious corruption)
  • availability (vs interference)

The main techniques developers use to help protect against these threats are:

  • encryption (via decryption key)
  • authorization (via signature, code or token)
  • checksums (checking for data corruption)
  • hashing (checking for malicious interference)
Authentication

Process of verifying someone is who they say they are. Methods include:

  • Username / password
    • One-step / two-step
  • Digital signatures
    • Digest Access Authentication - MD5(username:realm:password) .. Produces a hash
    • XML signature - <signature><method><hashvalue> … etc
  • Application tokens
    • OAuth2: authorization framework that allows you to approve one application interacting with another on your behalf without giving away your password
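As a rough illustration of the Digest Access Authentication hash mentioned above, the sketch below builds the basic response value: HA1 = MD5(username:realm:password), HA2 = MD5(method:uri), response = MD5(HA1:nonce:HA2). It assumes the simplest form of the scheme (no quality-of-protection options), and the username, realm, password, URI and nonce are made-up values.

```python
import hashlib


def md5_hex(text):
    """Return the MD5 digest of text as a lowercase hex string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri, nonce):
    # HA1 hashes the credentials, so the password itself is never sent.
    ha1 = md5_hex(f"{username}:{realm}:{password}")
    # HA2 hashes the request being made.
    ha2 = md5_hex(f"{method}:{uri}")
    # The server-supplied nonce ties the response to one challenge.
    return md5_hex(f"{ha1}:{nonce}:{ha2}")


# Hypothetical values, for illustration only.
response = digest_response("alice", "example", "s3cret", "GET", "/index.html", "abc123")
print(response)
```

The server holds the same credentials, repeats the calculation, and compares the two hashes.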
Encryption

The process through which data is encoded so that it remains hidden from or inaccessible to unauthorized users. It helps protect private information and sensitive data, and can enhance the security of communication between client apps and servers.

Encryption transforms data into an unreadable format. Key terms:

  • plaintext (unencrypted data)
  • ciphertext (encrypted data)
  • encryption or decryption process (algorithm)
  • key (used in the encryption and decryption process)
Hashing (hash function): one-way cryptographic algorithm that takes an input message of arbitrary length and produces a fixed-length digest that can't be decoded (or is extremely difficult to decode). Hash algorithms are designed to be one way algorithms, so that hashed values don't need to be 'read', they only need to be 'matched'. Example use: storing passwords in a database.
Hashing

Successful hash characteristics:

  • Reliability
    • The same text should produce the same hash digest every time
  • One-way
    • Can’t reverse engineer text from hash digest
  • Collision resistance
    • It should be infeasible to find two different inputs that produce the same hash digest
  • Speed

Salts (unique values that can be added to a word before hashing) can prevent pre-computed hash matching. Examples of hash algorithms include MD5 and SHA (SHA-1, SHA-256, SHA-512).
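A minimal sketch of the reliability and fixed-length properties, using SHA-256 from Python's standard hashlib module. The helper name sha256_hex is just for this example.

```python
import hashlib


def sha256_hex(text):
    """Return the SHA-256 digest of text as a hex string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


short_digest = sha256_hex("a")
long_digest = sha256_hex("a" * 10_000)

# Fixed length: input size does not change digest size.
print(len(short_digest), len(long_digest))          # 64 64

# Reliability: the same text gives the same digest every time.
print(sha256_hex("hello") == sha256_hex("hello"))   # True

# A one-character change gives a different digest.
print(sha256_hex("hello") == sha256_hex("hellp"))   # False
```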

Encryption: converting a string into another string that can be decoded using a key. Example use: sending credit card details via a web application. Because ciphertext can be decoded by anyone who holds the key, encryption gives weaker protection for stored secrets than hashing; however, sending a retailer a hashed credit card number that can't be decoded is pointless. For that reason, encryption should be used instead of hashing only when the resulting message needs to be decrypted.
Salt(ing): appending or prepending a random string (called a salt) to the password before hashing. This ensures that two users with the same password will have two different password hashes. Salts should be of reasonable length (a good rule of thumb is to use salts that are at least as long as the hash output). Never reuse a salt: always generate a new random salt for every new hash (the salt needs to be stored, and a link maintained to the hash it belongs to).
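The salting steps can be sketched as below. This is a simplified illustration using plain SHA-256 with a prepended salt, as described in these notes; the function names are invented for the example, and a real system would normally use a purpose-built password hashing function instead.

```python
import hashlib
import hmac
import os


def hash_password(password):
    """Return (salt, digest) for a new password."""
    # New random salt for every hash; 32 bytes matches the SHA-256 digest size.
    salt = os.urandom(32)
    digest = hashlib.sha256(salt + password.encode("utf-8")).digest()
    # Both values must be stored together so the hash can be re-checked later.
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Re-hash the candidate password with the stored salt and compare."""
    candidate = hashlib.sha256(salt + password.encode("utf-8")).digest()
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(candidate, expected_digest)


salt_a, digest_a = hash_password("hunter2")
salt_b, digest_b = hash_password("hunter2")

print(digest_a == digest_b)                            # False: different salts
print(verify_password("hunter2", salt_a, digest_a))    # True
print(verify_password("wrong", salt_a, digest_a))      # False
```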
Checksum (aka hashsum): the outcome of running a hash function on a piece of data (usually a single file). A checksum can be used to "check" that your data (or file) is the same as what was promised by the source of the data (or file). In this example, a single missing full stop produces a clearly different checksum:

This is a test. >> MD5 HASH CHECKSUM >> 120EA8A25E5D487BF68B5F7096440019

This is a test >> MD5 HASH CHECKSUM >> CE114E4501D2F4E2DCEA3E17B546F339
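A short sketch that can be used to recompute MD5 checksums for the two strings above and to check data against a published checksum. The helper names are assumptions for this example.

```python
import hashlib


def md5_checksum(data):
    """Return the MD5 checksum of a bytes object as uppercase hex."""
    return hashlib.md5(data).hexdigest().upper()


def matches_published(data, published_checksum):
    """True if the data hashes to the checksum the source promised."""
    return md5_checksum(data) == published_checksum.upper()


with_full_stop = md5_checksum(b"This is a test.")
without_full_stop = md5_checksum(b"This is a test")

print(with_full_stop)
print(without_full_stop)
print(with_full_stop == without_full_stop)   # False
```

For a downloaded file, the same check would be run on the file's bytes and compared with the checksum shown on the download page.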

A lot of the terminology is used interchangeably:


A checksum (such as CRC32) is to prevent accidental changes. If one byte changes, the checksum changes. The checksum is not safe to protect against malicious changes: it is pretty easy to create a file with a particular checksum.

A hash function maps some data to other data. It is often used to speed up comparisons or create a hash table. Not all hash functions are secure, and the hash does not necessarily change when the data changes.

A cryptographic hash function (such as SHA1) is a checksum that is secure against malicious changes. It is pretty hard to create a file with a specific cryptographic hash.

To make things more complicated, cryptographic hash functions are sometimes simply referred to as hash functions.
Symmetric cryptography uses the SAME KEY to encrypt and decrypt.

All of the following ciphers, including the classical Caesar and Vigenère ciphers, are symmetric.
Caesar cipher advantages:
  • one of the easiest methods to use in cryptography and can provide minimum security to the information (good for children or persons who have very little experience with security and encryption)
  • use of only a short key in the entire process
  • one of the best methods to use if the system cannot use any complicated coding techniques
  • requires few computing resources and can be done easily with a pen and paper
Caesar cipher disadvantages:
  • simple structure that is easy to analyse
  • can only provide minimum security to the information
  • frequency of the letter pattern provides a big clue in deciphering the entire message
  • because one shift is applied to every character, there are very few keys to try: an encrypted number sequence has only 10 possible shifts
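A minimal Caesar cipher sketch for uppercase letters, showing that the same key (the shift) both encrypts and decrypts, and that every possible key can be tried in one short loop. The function name and the sample message are made up for this example.

```python
import string

ALPHABET = string.ascii_uppercase


def caesar(text, shift):
    """Shift each letter by `shift` places; other characters pass through."""
    out = []
    for ch in text.upper():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            out.append(ch)
    return "".join(out)


ciphertext = caesar("HELLO", 3)
plaintext = caesar(ciphertext, -3)   # same key, opposite direction
print(ciphertext)   # KHOOR
print(plaintext)    # HELLO

# Brute force: there are so few shifts that all of them can be listed.
candidates = [caesar(ciphertext, -k) for k in range(26)]
print("HELLO" in candidates)   # True
```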
Block cipher

A block cipher takes a block of plaintext bits and generates a block of ciphertext bits, generally of the same size.
See the block cipher examples in Python from the exam.
Vigenère cipher - a polyalphabetic substitution cipher
Vigenère square
key = "XYZ" (the key is repeated until it is the same length as the message: "XYZX")
plaintext = "MATE"
ciphertext = "JYSB"
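The worked example above can be checked with a short sketch. It assumes the message and key contain only the letters A to Z; the function name is invented for this example.

```python
from itertools import cycle


def vigenere(text, key, decrypt=False):
    """Apply the Vigenère cipher to a letters-only message."""
    direction = -1 if decrypt else 1
    out = []
    # cycle() wraps the key so it lines up with every message letter.
    for ch, k in zip(text.upper(), cycle(key.upper())):
        shift = ord(k) - ord("A")
        position = (ord(ch) - ord("A") + direction * shift) % 26
        out.append(chr(position + ord("A")))
    return "".join(out)


print(vigenere("MATE", "XYZ"))                 # JYSB
print(vigenere("JYSB", "XYZ", decrypt=True))   # MATE
```

Each letter is shifted by a different amount (X = 23, Y = 24, Z = 25, then X again), which is what makes the cipher polyalphabetic.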
Symmetric Encryption

Well known symmetric block ciphers:

  • Triple DES
    • Encrypts data 3 times using a different key at least once – very slow. Good for PINs in ATMs
  • Blowfish
    • efficient open source algorithm for application encryption
    • splits message into 64 bit blocks and encrypts each block individually
    • Twofish is very similar but uses 128-bit blocks
  • AES
    • US Government Advanced Encryption Standard (128-, 192- or 256-bit keys), selected in 2000 and standardized in 2001 to replace DES (56-bit key, 64-bit blocks)
Commercial Block Cipher: AES-128
  • each key is 128 bits long, so there are 2^128 possible keys
  • if trying each key takes about 10 math operations, an exhaustive search needs about 2^128 * 10 operations in total
  • also available in 192 or 256 bits
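To get a feel for these numbers, the arithmetic can be done directly in Python, which handles arbitrarily large integers. The attacker speed below is a made-up assumption, not a measured figure.

```python
KEY_BITS = 128
OPS_PER_KEY = 10                    # figure used in the notes above
GUESSES_PER_SECOND = 10 ** 12       # hypothetical attacker speed
SECONDS_PER_YEAR = 60 * 60 * 24 * 365

keys = 2 ** KEY_BITS
total_ops = keys * OPS_PER_KEY
years_to_try_all = keys // (GUESSES_PER_SECOND * SECONDS_PER_YEAR)

print(keys)                          # 340282366920938463463374607431768211456
print(f"{total_ops:.2e} operations")
print(f"{years_to_try_all:.2e} years")
```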

Asymmetric encryption

Asymmetric cryptography uses DIFFERENT KEYS:

  • to encrypt (public_key)
  • to decrypt (private_key)

Both keys are 'linked' mathematically but it is computationally infeasible to calculate the private key from the public key. Asymmetric encryption is slower than symmetric cryptography, but it avoids having to share a secret key in advance. Public/private key encryption is a viable method for transmitting data between a secured section of a website and an end user.

Asymmetric Encryption

A well-known asymmetric algorithm:

  • RSA
    • Public key to encrypt
    • Private key to decrypt
    • Widely used in internet and network security
    • Historically used by most websites you visit with HTTPS (no longer always the case)
    • HTTPS uses TLS, the successor to Secure Sockets Layer (SSL), to encrypt data transfers between client and server
    • SSL/TLS can use the RSA algorithm for key exchange and authentication
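A toy sketch of the public/private key idea using textbook RSA with tiny primes. The primes, exponent and message are illustrative values only; real RSA uses very large primes plus padding, and this example needs Python 3.8 or later for the modular inverse.

```python
# Toy RSA: illustration only, far too small to be secure.
p, q = 61, 53
n = p * q                    # 3233, shared by both keys
phi = (p - 1) * (q - 1)      # 3120
e = 17                       # public exponent: public key is (e, n)
d = pow(e, -1, phi)          # private exponent: private key is (d, n)


def encrypt(message):
    """Encrypt an integer smaller than n with the public key."""
    return pow(message, e, n)


def decrypt(ciphertext):
    """Decrypt with the private key."""
    return pow(ciphertext, d, n)


message = 65
ciphertext = encrypt(message)
recovered = decrypt(ciphertext)
print(d)           # 2753
print(recovered)   # 65
```

Anyone can encrypt with (e, n), but recovering d requires knowing p and q, which is what becomes infeasible when the primes are large.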
Can you complete the following cognitions:

recognise and describe features of symmetric (DES: Data Encryption Standard, Triple DES, AES: Advanced Encryption Standard, Blowfish and Twofish) and asymmetric (RSA) encryption algorithms



No coding is required for these; just be able to recognise and describe them as per the next slide.
Once you understand base systems, you can work with any base:
Binary
Base 2: [0,1]
2^3  2^2  2^1  2^0
8    4    2    1
1    0    1    1
equals 8 + 2 + 1 = decimal 11 (or in hexadecimal: 'b')
4 binary digits (aka bits) gives us 16 different combinations.
In other words, each hexadecimal digit (base 16) can represent four binary digits:

Hexadecimal
Base 16: [0,1,2,3,4,5,6,7,8,9,a,b,c,d,e,f]
16^3  16^2  16^1  16^0
4096  256   16    1
            b     8
equals 16*b(11) + 8 = 176 + 8 = decimal 184
This is why 256-bit hash algorithms produce a string that is (256/4)=64 characters long
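The conversions above can be checked with Python's built-in base handling. The input to the hash at the end is an arbitrary example value.

```python
import hashlib

# Binary 1011 -> decimal 11 -> hexadecimal 'b'
print(int("1011", 2))         # 11
print(format(11, "x"))        # b

# Hexadecimal b8 -> decimal 184 -> binary (two hex digits = eight bits)
print(int("b8", 16))          # 184
print(format(0xb8, "08b"))    # 10111000

# A 256-bit digest written in hex is 256 / 4 = 64 characters long.
digest = hashlib.sha256(b"example").hexdigest()
print(len(digest))            # 64
```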